Refining Aggregate Conditions in Relational Learning

Authors

  • Celine Vens
  • Jan Ramon
  • Hendrik Blockeel
Abstract

In relational learning, predictions for an individual are based not only on its own properties but also on the properties of a set of related individuals. Many systems use aggregates to summarize this set. Features thus introduced compare the result of an aggregate function to a threshold. We consider the case where the set to be aggregated is generated by a complex query and present a framework for refining such complex aggregate conditions along three dimensions: the aggregate function, the query used to generate the set, and the threshold value. The proposed aggregate refinement operator allows a more efficient search through the hypothesis space and thus can be beneficial for many relational learners that use aggregates. As an example application, we have implemented the refinement operator in a relational decision tree induction system. Experimental results show a significant efficiency gain in comparison with the use of a less advanced refinement operator.
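The three refinement dimensions named in the abstract can be illustrated with a minimal sketch. This is not the authors' operator: the class `AggregateCondition`, the helper functions, the restriction to lower-bound selections, and the sample data are all hypothetical, chosen only to show what it means to change the aggregate function, narrow the query that generates the set, or move the threshold.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]

# Hypothetical registry of aggregate functions (an assumption of this sketch).
AGGREGATES: Dict[str, Callable[[List[float]], float]] = {
    "count": lambda xs: float(len(xs)),
    "max": lambda xs: max(xs),
    "min": lambda xs: min(xs),
}


@dataclass(frozen=True)
class AggregateCondition:
    """Condition of the form: aggregate(attribute over selected rows) >= threshold."""
    function: str                               # key into AGGREGATES
    attribute: str                              # column being aggregated
    selections: Tuple[Tuple[str, float], ...]   # (column, lower bound) filters
    threshold: float

    def selected(self, rows: List[Row]) -> List[Row]:
        # The "query": keep rows that satisfy every selection condition.
        return [r for r in rows if all(r[c] >= b for c, b in self.selections)]

    def holds(self, rows: List[Row]) -> bool:
        values = [r[self.attribute] for r in self.selected(rows)]
        if not values and self.function != "count":
            return False  # max/min are undefined on an empty set
        return AGGREGATES[self.function](values) >= self.threshold


def refine_function(cond: AggregateCondition, name: str) -> AggregateCondition:
    # Dimension 1: swap the aggregate function.
    return replace(cond, function=name)


def refine_query(cond: AggregateCondition, column: str, bound: float) -> AggregateCondition:
    # Dimension 2: add a selection condition, shrinking the aggregated set.
    return replace(cond, selections=cond.selections + ((column, bound),))


def refine_threshold(cond: AggregateCondition, value: float) -> AggregateCondition:
    # Dimension 3: change the threshold value.
    return replace(cond, threshold=value)


# Made-up related records for one individual (illustrative data only).
rows = [
    {"balance": 100.0, "age": 2.0},
    {"balance": 500.0, "age": 5.0},
    {"balance": 50.0, "age": 1.0},
]

base = AggregateCondition("count", "balance", (), 2.0)
narrowed = refine_query(base, "balance", 100.0)
stricter = refine_threshold(narrowed, 3.0)

print(base.holds(rows))      # True: 3 rows >= 2
print(narrowed.holds(rows))  # True: 2 rows with balance >= 100
print(stricter.holds(rows))  # False: 2 rows < 3
```

In this toy setting, narrowing the query can only shrink the set, so a `count(...) >= t` condition that already fails cannot start to hold after further query refinement. Regularities of this kind are one plausible way a refinement operator could prune the search, though the sketch does not claim to reproduce the paper's framework.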

Related resources

IRSJ: incremental refining spatial joins for interactive queries in GIS

An increasing number of emerging web database applications deal with large georeferenced data sets. However, exploring these large data sets through spatial queries can be very time and resource intensive. The need for interactive spatial queries has arisen in many applications such as Geographic Information Systems (GIS) for efficient decision-support. In this paper, we propose a new interacti...

Using neural networks for relational learning

Relational learners need to be able to handle the information contained in a set of related tuples. Most current relational learners are biased either towards the use of aggregate functions that summarize that set, or towards checking the existence of specific kinds of elements in that set. Learning patterns that contain a combination of both is a challenging task. In this paper we introduce a ...

A Random Forest Approach to Relational Learning

Random forest induction is an ensemble method that uses a random subset of features to build each node in a decision tree. The method has been shown to work well when many features are available. This certainly is the case in relational learning, especially when aggregate functions, combined with selection conditions on the set to be aggregated, are included in the feature space. This paper pre...

A Toolbox for Learning from Relational Data with Propositional and Multi-instance Learners

  • uses SQL aggregate functions like SUM, MIN, MAX, AVG and computed standard deviation, quartile and range to capture relational information
  • for each value of a nominal column a new attribute is introduced, containing the number of occurrences
  • pairs of attributes (one is nominal) are used as GROUP BY conditions for additional aggregations
  • determines relations between tables based on name ...

Refining Hygienic Macros for Modules and Separate Compilation

Genuine differences in the treatment of identifiers in block-structured languages and those that provide qualified names for accessing components of modules or aggregate data structures invalidate some of the assumptions hygienic macro systems are based on. We will investigate how these assumptions have to be changed, and the consequences for the construction of hygienic macro expanders. Macro ...

Publication date: 2006